Feature and model space speaker adaptation with full covariance Gaussians
Abstract
Full covariance models can give better results for speech recognition than diagonal models, yet they introduce complications for standard speaker adaptation techniques such as MLLR and fMLLR. Here we introduce efficient update methods to train adaptation matrices for the full covariance case. We also experiment with a simplified technique in which we pretend that the full covariance Gaussians are diagonal and obtain adaptation matrices under that assumption. We show that this approximate method works almost as well as the exact method.
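To make the simplified technique described above concrete, here is a minimal NumPy sketch, not the paper's implementation: the function name, argument layout, and the single-transform (global regression class) setting are assumptions. It keeps only the diagonal of each full covariance matrix when accumulating the usual fMLLR statistics, then estimates the feature transform with the standard row-by-row update used for diagonal-covariance models.

import numpy as np

def estimate_fmllr_diag_approx(feats, posts, means, covars, n_iters=20):
    # Hypothetical sketch of the "pretend-diagonal" fMLLR estimate:
    # full-covariance Gaussians are treated as diagonal (only diag(Sigma)
    # is used) when accumulating statistics, and the transform is found
    # with the usual row-by-row update for diagonal models.
    #   feats : (T, d)    feature frames
    #   posts : (T, M)    per-frame Gaussian posteriors (occupancies)
    #   means : (M, d)    Gaussian means
    #   covars: (M, d, d) full covariance matrices
    T, d = feats.shape
    M = means.shape[0]
    zeta = np.hstack([feats, np.ones((T, 1))])           # extended frames [x; 1]

    beta = posts.sum()                                   # total occupancy
    K = np.zeros((d, d + 1))                             # linear statistics, one row per dimension
    G = np.zeros((d, d + 1, d + 1))                      # quadratic statistics, one per dimension
    for m in range(M):
        gamma = posts[:, m]
        inv_var = 1.0 / np.diag(covars[m])               # diagonal approximation of Sigma_m
        weighted = zeta * gamma[:, None]                 # (T, d+1)
        first = weighted.sum(axis=0)                     # sum_t gamma_t * zeta_t
        second = weighted.T @ zeta                       # sum_t gamma_t * zeta_t zeta_t^T
        for i in range(d):
            K[i] += inv_var[i] * means[m, i] * first
            G[i] += inv_var[i] * second

    # Row-by-row maximisation of  beta*log|A| - 1/2 sum_i (w_i G_i w_i^T - 2 w_i K_i^T)
    W = np.hstack([np.eye(d), np.zeros((d, 1))])         # start from the identity transform
    for _ in range(n_iters):
        for i in range(d):
            A = W[:, :d]
            cof = np.linalg.det(A) * np.linalg.inv(A).T  # cofactor matrix of A
            p = np.append(cof[i], 0.0)                   # extended cofactor row
            Ginv = np.linalg.inv(G[i])
            a = p @ Ginv @ p
            b = p @ Ginv @ K[i]
            # stationary points of the per-row objective satisfy alpha^2*a + alpha*b - beta = 0
            disc = np.sqrt(b * b + 4.0 * a * beta)
            candidates = [(-b + disc) / (2.0 * a), (-b - disc) / (2.0 * a)]
            alpha = max(candidates,
                        key=lambda al: beta * np.log(abs(al * a + b)) - 0.5 * al * al * a)
            W[i] = (alpha * p + K[i]) @ Ginv
    return W                                             # apply as x' = W @ [x; 1]

The returned transform is applied to every test frame of the target speaker; the exact full-covariance update differs in how the quadratic statistics are accumulated, which is what the efficient methods in the paper address.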
Similar resources
Speaker adaptation of convolutional neural network using speaker specific subspace vectors of SGMM
The recent success of convolutional neural network (CNN) in speech recognition is due to its ability to capture translational variance in spectral features while performing discrimination. The CNN architecture requires correlated features as input and thus fMLLR transform which is estimated in de-correlated feature space fails to give significant improvement. In this paper, we propose two metho...
Rapid speaker adaptation using MLLR and subspace regression classes
In recent years, various adaptation techniques for hidden Markov modeling with mixture Gaussians have been proposed, most notably MAP estimation and MLLR transformation. When the amount of adaptation data is limited, adaptation can be done by grouping similar Gaussians together to form regression classes and then transforming the Gaussians in groups. The grouping of Gaussians is often determine...
Large vocabulary conversational speech recognition with a subspace constraint on inverse covariance matrices
This paper applies the recently proposed SPAM models for acoustic modeling in a Speaker Adaptive Training (SAT) context on large vocabulary conversational speech databases, including the Switchboard database. SPAM models are Gaussian mixture models in which a subspace constraint is placed on the precision and mean matrices (although this paper focuses on the case of unconstrained means). They i...
Dimensional reduction, covariance modeling, and computational complexity in ASR systems
In this paper, we study acoustic modeling for speech recognition using mixtures of exponential models with linear and quadratic features tied across all context dependent states. These models are one version of the SPAM models introduced in [1]. They generalize diagonal covariance, MLLT, EMLLT, and full covariance models. Reduction of the dimension of the acoustic vectors using LDA/HDA projecti...
Publication date: 2006